Annotating the Propositions in the Penn Chinese Treebank
Abstract
In this paper, we describe an approach to annotating the propositions in the Penn Chinese Treebank. We describe how diathesis alternation patterns can be used to make coarse sense distinctions for Chinese verbs, a necessary step in annotating the predicate-argument structure of those verbs. We then discuss the representation scheme we use to label the semantic arguments and adjuncts of the predicates, along with several complications for this type of annotation and our solutions to them. We also discuss how a lexical database with predicate-argument structure information can be used to ensure consistent annotation. Finally, we discuss possible applications for this resource.
Similar resources
Proposition Bank II: Delving Deeper
The PropBank project is creating a corpus of text annotated with information about basic semantic propositions. PropBank I (Kingsbury & Palmer, 2002) added a layer of predicate-argument information, or semantic roles, to the syntactic structures of the English Penn Treebank. This paper presents an overview of the second phase of PropBank Annotation, PropBank II, which is being applied to English...
Annotating Modal Expressions in the Chinese Treebank
This paper reports an effort to annotate modality in the Penn Chinese Treebank. We introduce the modals and features that were annotated, and describe the phases of our working process. Along with this, we address the issues in the preparation of annotation guidelines, and present the preliminary results of the first pass. Finally, we analyze the types of disagreement, and propose directions to...
The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank
The lack of an open discourse corpus for Chinese limits many natural language processing tasks. In this work, we present the first open discourse treebank for Chinese, namely, the Discourse Treebank for Chinese (DTBC). At the current stage, we annotated explicit intra-sentence discourse connectives, their corresponding arguments, and senses for all 890 documents of the Chinese Treeb...
Developing Guidelines for the Annotation of Anaphors in the Chinese Treebank
This paper describes the CTB Coreference Annotation Guidelines for annotating pronominal anaphoric expressions in the Penn Chinese Treebank. The goals of the annotation are: to provide training data for learning-based pronoun resolution tools, and to provide a "gold" standard to be used in the evaluation of pronoun resolution algorithms. The choices that were made concerning the coindexing of p...
Annotating Attribution In The Penn Discourse TreeBank
An emerging task in text understanding and generation is to categorize information as fact or opinion and to further attribute it to the appropriate source. Corpus annotation schemes aim to encode such distinctions for NLP applications concerned with such tasks, such as information extraction, question answering, summarization, and generation. We describe an annotation scheme for marking the at...